Analyzing the range query performance of two partitioning methods in high-dimensional space
Authors
Abstract
Given N d-dimensional data items and a maximum branching factor Bfmax, packing partitions the data into groups of Bfmax items and stores each group in one disk page. The range query performance of packing depends heavily on how the data is partitioned, so partitioning is the main problem tackled in our work. We suggest two extreme cases of partitioning methods: grid-like balanced partitioning and onion-peeling-like unbalanced partitioning. We analyze them based on the Minkowski-sum cost model and present the expected number of intersecting pages for d-dimensional hypercubic range queries. The analysis clearly indicates the efficient method of partitioning. By experimentation, we validate our...
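As a rough illustration of the Minkowski-sum cost model mentioned in the abstract, the sketch below estimates the expected number of pages intersected by a hypercubic range query. It is not the paper's analysis. It assumes a unit data space, uniformly distributed query centers, axis-aligned rectangular pages, and no boundary correction, so a page with side lengths s_1, ..., s_d is hit by a query of side q with probability (s_1 + q) * ... * (s_d + q). The function names and the `grid_pages` helper are hypothetical, and only a grid-like balanced layout is modeled.

```python
from math import prod

def intersect_probability(extents, q):
    """Minkowski-sum estimate: enlarge each page side by the query side q
    and take the volume (assumes unit space, no boundary clipping)."""
    return prod(s + q for s in extents)

def expected_pages(pages, q):
    """Expected number of pages a query of side q intersects:
    the sum of the per-page intersection probabilities."""
    return sum(intersect_probability(extents, q) for extents in pages)

def grid_pages(d, splits_per_dim):
    """Hypothetical grid-like balanced layout: every dimension is cut
    into the same number of equal slices, giving splits_per_dim**d pages."""
    side = 1.0 / splits_per_dim
    return [[side] * d for _ in range(splits_per_dim ** d)]

if __name__ == "__main__":
    # Compare the estimate for a fixed query side as dimensionality grows.
    for d in (2, 4, 8):
        pages = grid_pages(d, 2)
        print(d, len(pages), expected_pages(pages, 0.1))
```

Under these assumptions, a point query (q = 0) on a grid that tiles the unit space always yields an expected value of one page, which is a quick sanity check on the formula.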
Similar resources
Supervised Feature Extraction of Face Images for Improvement of Recognition Accuracy
Dimensionality reduction methods transform or select a low-dimensional feature space to efficiently represent the original high-dimensional feature space of the data. Feature reduction techniques are an important step in many pattern recognition problems in different fields, especially in the analysis of high-dimensional data. Hyperspectral images are acquired by remote sensors and human face images ar...
Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach
Feature selection can be decisive when analyzing high-dimensional data, especially with a small number of samples. Feature extraction methods do not perform well under these conditions. With small sample sets and high-dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...
Methods to evaluate the performance of kilovoltage cone-beam computed tomography in the three-dimensional reconstruction space
Background: Cone-beam computed tomography (CBCT) scanners for image-guided radiotherapy are in clinical use today, but there has been no consensus on uniform acceptance to verify the CBCT image quality yet. The present work proposed new methods to fully evaluate the performance of CBCT in its three-dimensional (3D) reconstruction space. Materials and Methods: Compared to the traditional methods...
A new model for phrase search based on weighted minimum displacement
Finding high-quality web pages is one of the most important tasks of search engines. The relevance between the documents found and the query searched depends on the user observation and increases the complexity of ranking algorithms. The other issue is that users often explore just the first 10 to 20 results while millions of pages related to a query may exist. So search engines have to use sui...
Institut für Informatik der Technischen Universität München MISTRAL: Processing Relational Queries using a Multidimensional Access Technique
A multidimensional access method offering significant performance increases by intelligently partitioning the query space is applied to relational database management systems (RDBMS). We introduce a formal model for multidimensional partitioned relations and discuss several typical query patterns. The model identifies the significance of multidimensional range queries and sort operations. The d...